Defer MonitorUpdatingPersister writes to flush() #4317

joostjager · 2026-01-15T14:02:47Z

Update MonitorUpdatingPersister and MonitorUpdatingPersisterAsync to queue persist operations in memory instead of writing immediately to disk. The Persist trait methods now return ChannelMonitorUpdateStatus::InProgress and the actual writes happen when flush() is called.

This fixes a race condition that could cause channel force closures: previously, if the node crashed after writing channel monitors but before writing the channel manager, the monitors would be ahead of the manager on restart. By deferring monitor writes until after the channel manager is persisted (via flush()), we ensure the manager is always at least as up-to-date as the monitors.
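
The ordering argument above can be checked with a toy crash simulation. This is only a sketch: `Disk`, `Step`, `run_until_crash` and `monitor_ahead` are hypothetical names made up for illustration, and a single version number stands in for the real manager and monitor state.

```rust
/// Toy on-disk state: one version number each for the channel manager and a
/// channel monitor. Hypothetical model, not LDK's actual storage layout.
#[derive(Debug, Default, PartialEq)]
pub struct Disk {
    pub manager: u64,
    pub monitor: u64,
}

/// One durable write in a persistence sequence.
#[derive(Clone, Copy)]
pub enum Step {
    WriteMonitor(u64),
    WriteManager(u64),
}

/// Apply only the first `crash_after` steps, as if the node crashed right
/// after them, and return what ended up on disk.
pub fn run_until_crash(steps: &[Step], crash_after: usize) -> Disk {
    let mut disk = Disk::default();
    for step in steps.iter().take(crash_after) {
        match *step {
            Step::WriteMonitor(v) => disk.monitor = v,
            Step::WriteManager(v) => disk.manager = v,
        }
    }
    disk
}

/// The state this PR wants to rule out: the monitor is ahead of the manager.
pub fn monitor_ahead(disk: &Disk) -> bool {
    disk.monitor > disk.manager
}
```

With the monitor written first, a crash after one step leaves the monitor ahead of the manager. With the manager written first, no crash point in this model produces that state.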

Key changes:

- Add PendingWrite enum to represent queued write/remove operations
- Add pending_writes queue to MonitorUpdatingPersisterAsyncInner
- Add flush() to Persist trait and ChainMonitor
- Update Persist impl to queue writes and return InProgress
- Call flush() in background processor after channel manager persistence
- Remove unused event_notifier from AsyncPersister
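
A minimal sketch of the queue-then-flush shape these changes describe, written against the standard library only. Everything here is an assumption for illustration: `UpdateStatus` stands in for `ChannelMonitorUpdateStatus`, the fields of `PendingWrite` are guessed, `QueueingPersister` is a made-up type, and a `HashMap` plays the role of the on-disk store. It is not the PR's implementation.

```rust
use std::collections::{HashMap, VecDeque};
use std::sync::Mutex;

/// Stand-in for LDK's `ChannelMonitorUpdateStatus`; only the variant discussed here.
#[derive(Debug, PartialEq)]
pub enum UpdateStatus {
    InProgress,
}

/// A queued operation. The PR names a `PendingWrite` enum for write/remove
/// operations; the fields below are assumptions.
pub enum PendingWrite {
    Write { key: String, value: Vec<u8> },
    Remove { key: String },
}

/// Toy persister: `store` stands in for the durable key-value store.
#[derive(Default)]
pub struct QueueingPersister {
    pending_writes: Mutex<VecDeque<PendingWrite>>,
    store: Mutex<HashMap<String, Vec<u8>>>,
}

impl QueueingPersister {
    /// Queue a write in memory and report it as still in progress.
    pub fn persist(&self, key: &str, value: Vec<u8>) -> UpdateStatus {
        let op = PendingWrite::Write { key: key.to_string(), value };
        self.pending_writes.lock().unwrap().push_back(op);
        UpdateStatus::InProgress
    }

    /// Queue a removal in memory and report it as still in progress.
    pub fn remove(&self, key: &str) -> UpdateStatus {
        let op = PendingWrite::Remove { key: key.to_string() };
        self.pending_writes.lock().unwrap().push_back(op);
        UpdateStatus::InProgress
    }

    /// Apply all queued operations in order; returns how many were applied.
    pub fn flush(&self) -> usize {
        let drained: Vec<PendingWrite> =
            self.pending_writes.lock().unwrap().drain(..).collect();
        let applied = drained.len();
        let mut store = self.store.lock().unwrap();
        for op in drained {
            match op {
                PendingWrite::Write { key, value } => {
                    store.insert(key, value);
                },
                PendingWrite::Remove { key } => {
                    store.remove(&key);
                },
            }
        }
        applied
    }

    /// Read back what the "disk" currently holds for `key`.
    pub fn stored(&self, key: &str) -> Option<Vec<u8>> {
        self.store.lock().unwrap().get(key).cloned()
    }
}
```

In this model nothing reaches the store until `flush()` runs, which is the property the caller relies on to persist the channel manager first.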

ldk-reviews-bot · 2026-01-15T14:02:50Z

👋 Hi! I see this is a draft PR.
I'll wait to assign reviewers until you mark it as ready for review.
Just convert it out of draft status when you're ready for review!

TheBlueMatt · 2026-01-15T14:26:07Z

lightning/src/util/persist.rs

 use core::mem;
 use core::ops::Deref;
-use core::pin::{pin, Pin};
+use core::pin::pin;


I think we should be able to do this without touching persist.rs.

Where would we do it then? Queue in ChainMonitor, in KVStore, or somewhere else?

ChainMonitor would need to store the actual monitor data to defer writes, not just track update IDs as it does now. This means either cloning expensive ChannelMonitor objects or storing serialized bytes, which leaks persistence format details into ChainMonitor?

joostjager · 2026-01-16T12:07:19Z

ldk-node's channel_full_cycle test now passes with deferred monitor writes. This validates that channel open, bidirectional payments, and close all work correctly when persist methods return InProgress and actual writes happen on flush(). The critical piece is calling channel_monitor_updated after each flush to unblock the channel - without this, channels would stay paused forever waiting for persistence confirmation.
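
Why the completion signal matters can be shown with a small toy model. `ToyChannel` and its fields are hypothetical; the method `channel_monitor_updated` here only borrows the name of the real call and does not reproduce its signature.

```rust
use std::collections::BTreeSet;

/// Toy channel that stays paused while any monitor update is unconfirmed.
#[derive(Default)]
pub struct ToyChannel {
    /// Update ids for which persistence returned "in progress".
    in_flight: BTreeSet<u64>,
}

impl ToyChannel {
    /// Record that a persist call for `update_id` returned InProgress.
    pub fn persist_returned_in_progress(&mut self, update_id: u64) {
        self.in_flight.insert(update_id);
    }

    /// Record that the write for `update_id` has completed (after a flush).
    pub fn channel_monitor_updated(&mut self, update_id: u64) {
        self.in_flight.remove(&update_id);
    }

    /// The channel can only make progress once nothing is in flight.
    pub fn is_paused(&self) -> bool {
        !self.in_flight.is_empty()
    }
}
```

If the flush path never signals completion, `in_flight` never empties in this model and the channel stays paused indefinitely.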

Not yet tested: node restart scenarios where the manager/monitor ordering invariant matters, which is the primary motivation for this change.

Update MonitorUpdatingPersister and MonitorUpdatingPersisterAsync to queue persist operations in memory instead of writing immediately to disk. The Persist trait methods now return ChannelMonitorUpdateStatus::InProgress and the actual writes happen when flush() is called.

This fixes a race condition that could cause channel force closures: previously, if the node crashed after writing channel monitors but before writing the channel manager, the monitors would be ahead of the manager on restart. By deferring monitor writes until after the channel manager is persisted (via flush()), we ensure the manager is always at least as up-to-date as the monitors.

The flush() method takes a count parameter specifying how many queued writes to flush. The background processor captures the queue size before persisting the channel manager, then flushes exactly that many writes afterward. This prevents flushing monitor updates that arrived after the manager state was captured.

Key changes:

- Add PendingWrite enum with FullMonitor and Update variants for queued writes
- Add pending_writes queue to MonitorUpdatingPersisterAsyncInner
- Add pending_write_count() and flush(count) to Persist trait and ChainMonitor
- ChainMonitor::flush() calls channel_monitor_updated for each completed write
- Stale update cleanup happens in flush() after full monitor is written
- Call flush() in background processor after channel manager persistence

Co-Authored-By: Claude Opus 4.5 <[email protected]>
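
The count-based flush described in this commit message can be sketched as a toy model. `ToyNode`, its fields, and `persist_cycle` are hypothetical names; update ids stand in for real monitor data, and the manager snapshot is reduced to "the highest update id it reflects". Only `pending_write_count` and `flush(count)` mirror names from the commit message, and their signatures here are assumptions.

```rust
use std::collections::VecDeque;

/// Toy model of one node's persistence state.
#[derive(Default)]
pub struct ToyNode {
    /// Monitor update ids queued in memory, oldest first.
    pending_writes: VecDeque<u64>,
    /// Update ids that have been written to "disk".
    pub monitors_on_disk: Vec<u64>,
    /// Highest update id reflected in the manager snapshot on "disk".
    pub manager_on_disk: Option<u64>,
    /// Update ids for which completion has been signalled.
    pub completed: Vec<u64>,
}

impl ToyNode {
    pub fn queue_monitor_update(&mut self, id: u64) {
        self.pending_writes.push_back(id);
    }

    pub fn pending_write_count(&self) -> usize {
        self.pending_writes.len()
    }

    /// Write at most `count` queued updates, oldest first, and signal
    /// completion for each one.
    pub fn flush(&mut self, count: usize) {
        let n = count.min(self.pending_writes.len());
        for id in self.pending_writes.drain(..n) {
            self.monitors_on_disk.push(id);
            self.completed.push(id);
        }
    }

    /// One background-processor cycle. `late_update` models a monitor update
    /// that arrives after the manager state was captured.
    pub fn persist_cycle(&mut self, late_update: Option<u64>) {
        // 1. Capture the queue size together with the manager state.
        let count = self.pending_write_count();
        let manager_state = self.pending_writes.back().copied().or(self.manager_on_disk);
        // 2. A new update may be queued while the manager is being persisted.
        if let Some(id) = late_update {
            self.queue_monitor_update(id);
        }
        // 3. Persist the manager, then flush exactly the captured count.
        self.manager_on_disk = manager_state;
        self.flush(count);
    }
}
```

In this model an update queued during manager persistence stays in memory until the next cycle, so the monitors on disk never get ahead of the manager snapshot.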

joostjager · 2026-01-16T14:03:33Z

The ldk-node payment benchmark (1000 payments, 10 iterations) shows a ~14% slowdown with deferred writes:

main: 9.65s median
deferred: 10.96s median

This is expected since we now queue writes in memory and flush them after channel manager persistence, adding overhead to the persist cycle. The tradeoff is correctness - ensuring monitors are never ahead of the manager on disk after a crash.
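
For reference, the ~14% figure follows from the two medians above: (10.96 - 9.65) / 9.65 is roughly 13.6%. A small helper makes the arithmetic explicit; `slowdown_percent` is a hypothetical name and not part of the benchmark code.

```rust
/// Relative slowdown of `candidate_s` over `baseline_s`, in percent.
pub fn slowdown_percent(baseline_s: f64, candidate_s: f64) -> f64 {
    (candidate_s - baseline_s) / baseline_s * 100.0
}
```

Calling `slowdown_percent(9.65, 10.96)` gives a value of about 13.6, which rounds to the ~14% quoted.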

joostjager force-pushed the mon-barrier branch from 79e9390 to c8405e2 on January 15, 2026 14:09

TheBlueMatt reviewed Jan 15, 2026

joostjager force-pushed the mon-barrier branch from c8405e2 to 40e909a on January 15, 2026 15:48

joostjager added this to Weekly Goals Jan 15, 2026

joostjager self-assigned this Jan 15, 2026

joostjager force-pushed the mon-barrier branch 3 times, most recently from 93ff6c9 to 181a6e0 on January 16, 2026 09:40

joostjager force-pushed the mon-barrier branch from 181a6e0 to a63bd21 on January 16, 2026 12:09

joostjager force-pushed the mon-barrier branch from a63bd21 to 6ed79d9 on January 16, 2026 12:24
